[Bugfix] [Relay] Insertion of "device_copy" CallNode to Resolve Device Conflict on Unconstrained Nodes by lecoan · Pull Request #15090 · apache/tvm

lecoan · 2023-06-13T14:54:42Z

This PR addresses issue #15019, which I opened previously, regarding the PlanDevices pass's failure when two operators share the same input but are assigned to different target devices. This scenario often occurs in neural networks, where multiple layers can consume the same input.

In the specific case of (a+b)*(b+c), where the first add operator is assigned to the CPU and the second to the GPU, the PlanDevices pass would fail because it could not determine the correct device for b.

The problematic behavior seemed to be due to the PlanDevices pass marking b for the CPU (when it first visits a+b) and then throwing an error when it attempts to place b on the GPU while visiting b+c.
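To make the failure mode concrete, here is a toy Python model of a "first visit wins" device assignment. It is only a sketch under stated assumptions: the function `plan_first_visit_wins`, the `DeviceConflict` exception, and the tuple layout are hypothetical and are not TVM's PlanDevices implementation or API.

```python
# Toy model only: this is NOT TVM's PlanDevices implementation.
# All names here (DeviceConflict, plan_first_visit_wins) are hypothetical.


class DeviceConflict(Exception):
    """Raised when a variable is required on two different devices."""


def plan_first_visit_wins(ops):
    """Assign each input variable to the device of the first op that uses it.

    ops: list of (op_name, device, input_names) in visiting order.
    Returns a dict mapping variable name to device, or raises DeviceConflict.
    """
    assigned = {}
    for op_name, device, inputs in ops:
        for var in inputs:
            # The first visit records the device; later visits must agree.
            previous = assigned.setdefault(var, device)
            if previous != device:
                raise DeviceConflict(
                    f"{var} is already on {previous}, but {op_name} needs it on {device}"
                )
    return assigned


# (a+b) on the CPU and (b+c) on the GPU, sharing the input b.
ops = [("add1", "cpu", ["a", "b"]), ("add2", "gpu", ["b", "c"])]

try:
    plan_first_visit_wins(ops)
    conflict_message = None
except DeviceConflict as err:
    conflict_message = str(err)

print(conflict_message)
```

In this toy model, b is pinned to the CPU while visiting the first add, so the second add cannot claim it for the GPU.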

The solution I've implemented in this PR is to insert a device_copy automatically. If b is assigned to the CPU after visiting a+b, the PlanDevices pass adds a device_copy that copies b to the GPU when visiting b+c.
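The idea can be sketched with a toy Python rewrite that replaces a conflicting input with an explicit copy instead of raising an error. This is an illustration under assumptions, not the C++ sub-pass added by this PR: `insert_device_copies` and the `("device_copy", var, src, dst)` tuple are hypothetical stand-ins for the real Relay nodes.

```python
# Toy model only: the names below are hypothetical and do not mirror
# the C++ code in this PR or any TVM API.


def insert_device_copies(ops):
    """Resolve shared-input device conflicts by inserting copy markers.

    ops: list of (op_name, device, input_names) in visiting order.
    Returns (home, rewritten) where:
      home      maps each variable to the device of its first use;
      rewritten lists the ops, with each conflicting input replaced by a
                ("device_copy", var, src_device, dst_device) tuple.
    """
    home = {}
    rewritten = []
    for op_name, device, inputs in ops:
        new_inputs = []
        for var in inputs:
            # The first use decides where the variable lives.
            src = home.setdefault(var, device)
            if src == device:
                new_inputs.append(var)
            else:
                # Instead of failing, copy the value to the consumer's device.
                new_inputs.append(("device_copy", var, src, device))
        rewritten.append((op_name, device, new_inputs))
    return home, rewritten


# (a+b) on the CPU and (b+c) on the GPU, sharing the input b.
ops = [("add1", "cpu", ["a", "b"]), ("add2", "gpu", ["b", "c"])]
home, rewritten = insert_device_copies(ops)

for op in rewritten:
    print(op)
```

Here b stays on the CPU, and only the GPU consumer sees a copied version of it, which matches the behavior described above for b+c.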

tvm-bot · 2023-06-13T14:54:46Z

Thanks for contributing to TVM! Please refer to the contributing guidelines https://tvm.apache.org/docs/contribute/ for useful information and tips. Please request code reviews from Reviewers by @-ing them in a comment.

cc @shingjan (see #10317 for details)

Generated by tvm-bot

lecoan · 2023-06-16T04:37:25Z

@mbs-octoml @Lunderberg Hi, all the test cases passed. Can you help me review the patch?

masahi · 2023-06-20T10:28:58Z

 *
 * Phase 1
 * -------
+ *  We iterate process the programs to find those nodes with conflicting virtual devices. If the


iteratively

Thanks for your review! My apologies for the spelling errors. I have double-checked the comment using ChatGPT to ensure accuracy.

masahi · 2023-06-20T10:29:27Z

+};
+
+/*!
+ * \brief Flows the device constraints over the module and find all the conflicted nodes. The


My apologies for the spelling errors. I have double-checked the comment using ChatGPT to ensure accuracy.

masahi · 2023-06-20T10:30:13Z

+  }
+
+  IRModule mod_;
+  std::unique_ptr<DeviceContext> dev_ctx_;


Why is a pointer needed here?

The reasoning is the same as in the PlanDevicesCore sub-pass, which holds DeviceDomains through a pointer to avoid unnecessary copying. The necessary information lives in dev_ctx_, which is created in ConflictedNodeFinder and then passed to ConflictedNodeRewriter, so we use a pointer here as well.

masahi · 2023-06-21T02:16:18Z

 *
 * Phase 1
 * -------
+ * We iterately process the programs and find nodes with conflicting virtual devices. If the


Still wrong, "iteratively"

Fix: add a new subpass in PlanDevice to add device_copy op for confli…

33f7211

…cated inputs

lecoan mentioned this pull request Jun 14, 2023

[BUG] [Relay] PlanDevices Pass Failure when Two Operators with Different Target Devices Share the Same Input #15019

Closed

masahi reviewed Jun 20, 2023

Fix some spelling errors in comments

8331e47

masahi reviewed Jun 21, 2023

Fix some spelling errors in comments

7c422d9

masahi approved these changes Jun 22, 2023

masahi merged commit 6b20cae into apache:main Jun 22, 2023

ysh329 mentioned this pull request Jul 12, 2023

[Release] v0.13.0 Release Candidate Notes #15295

Closed

masahi merged 3 commits into apache:main from lecoan:fix/plan_device


3 participants
